Abstract

Set and multiset variables in constraint programming have 
typically been represented using subset bounds. However, 
this is a weak representation that neglects potentially use- 
ful information about a set such as its cardinality. For set 
variables, the length-lex (LL) representation successfully pro- 
vides information about the length (cardinality) and position 
in the lexicographic ordering. For multiset variables, where 
elements can be repeated, we consider richer representa- 
tions that take into account additional information. We study 
eight different representations in which we maintain bounds 
according to one of the eight different orderings: length- 
(co)lex (LL/LC), variety-(co)lex (VL/VC), length-variety- 
(co)lex (LVL/LVC), and variety-length-(co)lex (VLL/VLC) 
orderings. These representations integrate together informa- 
tion about the cardinality, variety (number of distinct ele- 
ments in the multiset), and position in some total order- 
ing. Theoretical and empirical comparisons of expressiveness 
and compactness of the eight representations suggest that 
length-variety-(co)lex (LVL/LVC) and variety-length-(co)lex 
(VLL/VLC) usually give tighter bounds after constraint prop- 
agation. We implement the eight representations and evaluate 
them against the subset bounds representation with cardinal- 
ity and variety reasoning. Results demonstrate that they offer 
significantly better pruning and runtime. 

Introduction 

In constraint programming, we often need to model multisets
(or bags) of objects. For example, in the template design
problem (prob002 in CSPLib (Gent and Walsh 1999)),
we need to construct printing templates, which are multisets 
of different designs. Multisets, unlike sets, can contain rep- 
etition of elements. For popular designs, we may have mul- 
tiple copies on the same template. Surprisingly, whilst there 
has been significant progress on developing representations 
for sets, relatively little research has been done on how best 
to represent multisets. 

Sadler and Gervet (2004) proposed representing set vari- 
ables with subset, lexicographic, and cardinality bounds. In- 
deed, they suggested that such a representation could also be 
used for multisets (2008). However, little detail is provided 
about how to do this exactly. To compare two multisets, they 
lexicographically compare their occurrence vectors written 



in decreasing order. For instance, {3,3,2,1,1} < {4} < 
{4, 4}. Gervet and Van Hentenryck (2006) proposed repre- 
senting set variables using length-lex bounds, arguing that it 
provides comparable pruning to the aforementioned hybrid 
domains at a fraction of the computational cost. It is there- 
fore promising to consider length-lex and related bounds for 
multiset variables. However, as a number of different order- 
ings are possible, we have undertaken a theoretical and em- 
pirical comparison of the most promising options. 

As multisets permit repeated elements, we can incorporate
information about the variety (number of distinct elements)
(Law, Lee, and Woo 2009) in addition to the cardinality and
position in the lexicographic ordering. As a result, we
introduce eight different representations for multiset
variables in which we maintain bounds according to one of
eight different orderings: length-(co)lex (LL/LC),
variety-(co)lex (VL/VC), length-variety-(co)lex (LVL/LVC), and
variety-length-(co)lex (VLL/VLC) orderings. These bounds
provide information about the possible cardinality, variety, 
and position in the (co)lexicographic ordering of a multiset. 
We evaluate the expressiveness (whether the set of multisets 
can be exactly represented) and compactness (whether the 
interval is minimal) of the eight representations both theo- 
retically and empirically. Our results suggest that LVL/LVC 
and VLL/VLC representations are usually more expressive 
and more compact than LL/LC and VL/VC respectively. 
The eight representations give total orderings on multisets, 
which make enforcing bounds consistency on multiset vari- 
ables possible. However, when we attempt to enforce bounds 
consistency on the bounds of the proposed representations, 
this operation can be NP-hard even on unary constraints. To 
test out these representations, we implement the eight repre- 
sentations and evaluate them against the subset bounds rep- 
resentation with cardinality and variety reasoning. Results 
confirm that these new representations achieve significantly 
better pruning and runtime. 
Set Variables



Copyright © 2011, Association for the Advancement of Artificial 
Intelligence (www.aaai.org). All rights reserved. 



A set is an unordered collection of elements without repetition.
The cardinality of a set S is the number of elements in S,
denoted as $|S|$. Gervet (1997) proposed to represent the domain
of a set variable S with an interval [glb(S), lub(S)]
such that $D_S = \{m \mid glb(S) \subseteq m \subseteq lub(S)\}$. The greatest
lower bound glb(S) contains all the elements which must
exist in the set, while the least upper bound lub(S) contains
any element which can exist in the set. S is said to
be bound when its lower bound equals its upper bound (i.e.,
glb(S) = lub(S)). In this subset bounds representation, the
set domain is partially ordered under $\subseteq$. It also neglects the
cardinality and the position in lexicographic ordering which 
can be important in many problems. Thus, Gervet and Van 
Hentenryck (2006) proposed to totally order a set domain 
with a length-lex ordering. This representation incorporates 
the cardinality and the position in lexicographic ordering di- 
rectly, giving tighter bounds when enforcing bounds consis- 
tency. 

Notation Given a universe U of integers {1, ..., n}, set
variables, denoted as $S_i$, take their values from U. Sets are
denoted by letters s, t, x, and y. A subset s of U of cardinality
c is denoted by $\{s_1, s_2, \ldots, s_c\}$ where $s_1 < s_2 < \cdots < s_c$.

Length-lex Ordering The length-lex ordering $\preceq$ totally
orders sets first by cardinality and then lexicographically.

Definition 1. A length-lex ordering $\preceq$ is defined by:

$s \preceq t$ iff $s = \emptyset \lor |s| < |t| \lor (|s| = |t| \land (s_1 < t_1 \lor (s_1 = t_1 \land s \setminus \{s_1\} \preceq t \setminus \{t_1\})))$.

Definition 2. Given a universe U, a length-lex interval is a
pair of sets $\langle m, M \rangle$ which represents the sets between m and
M in the length-lex ordering (i.e., $\{s \subseteq U \mid m \preceq s \preceq M\}$).

Given a universe U = {1, ..., 4}, the sets are ordered
as follows: $\emptyset \preceq \{1\} \preceq \{2\} \preceq \{3\} \preceq \{4\}$
$\preceq \{1,2\} \preceq \{1,3\} \preceq \{1,4\} \preceq \{2,3\} \preceq \{2,4\} \preceq \{3,4\}$
$\preceq \{1,2,3\} \preceq \{1,2,4\} \preceq \{1,3,4\} \preceq \{2,3,4\} \preceq \{1,2,3,4\}$.
The length-lex interval $\langle \{1,2\}, \{3,4\} \rangle$ denotes
the set {{1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}}.
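
The interval in this example can be reproduced with a small sketch. The Python below is illustrative only and is not taken from the paper: the helper names `length_lex_key`, `all_subsets`, and `length_lex_interval` are assumptions, and the universe is enumerated explicitly, which is only practical for small n.

```python
from itertools import combinations

def length_lex_key(s):
    """Sort key for length-lex on sets: cardinality first, then lexicographic."""
    elems = tuple(sorted(s))
    return (len(elems), elems)

def all_subsets(n):
    """Enumerate every subset of the universe {1, ..., n}."""
    universe = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(universe, k)]

def length_lex_interval(m, M, n):
    """Return, in order, the sets s with m <= s <= M under length-lex."""
    lo, hi = length_lex_key(m), length_lex_key(M)
    ordered = sorted(all_subsets(n), key=length_lex_key)
    return [s for s in ordered if lo <= length_lex_key(s) <= hi]

interval = length_lex_interval({1, 2}, {3, 4}, n=4)
print([sorted(s) for s in interval])
```

Comparing the tuple of sorted elements mirrors the recursive clause of Definition 1: the smallest elements are compared first, and ties fall through to the remaining elements.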

Multiset Variables 

A multiset is a generalization of a set that allows elements
to repeat. Without loss of generality, we assume that multiset
elements are positive integers from 1 to n. We shall
use $\emptyset$ to denote both the empty set and the empty multiset.
The universe of a multiset is a multiset itself, which
defines the maximum possible occurrences of each element.
Given a universe U, we denote a multiset S as
$S = \{\!\{m_1, m_2, \ldots, m_c\}\!\}$ where $m_i \le m_j$ for $1 \le i < j \le c$,
its cardinality (total number of elements) as $|S|$, and its
variety (total number of distinct elements)
(Law, Lee, and Woo 2009) as $\|S\|$. For example, if
$S = \{\!\{1,1,2,2,3\}\!\}$, then $|S| = 5$ and $\|S\| = 3$. Since an
element in a multiset variable can occur multiple times, we
let occ(i, S) be the number of occurrences of an element i
in the multiset S. Walsh (2003) proposed using an occurrence
vector (occ(1, S), ..., occ(n, S)) to represent a multiset
variable with n elements. For example, the occurrence
representation for the value $\{\!\{1,1,2,2,3\}\!\}$ with the universe
$U = \{\!\{1,1,2,2,3,3\}\!\}$ is (occ(1, S), occ(2, S), occ(3, S)) =
(2, 2, 1).

Note that a set value can also be represented using the
occurrence representation, in which the number of occurrences
is either 0 or 1 to denote the existence of the corresponding
element. Thus, we adopt the occurrence representation for
multiset variables and order the occurrence vector to give
various orderings in multisets.
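
As a minimal sketch of the occurrence representation (the function names here are assumptions made for illustration, not part of the paper), cardinality and variety can both be read directly off the occurrence vector:

```python
from collections import Counter

def occurrence_vector(multiset, n):
    """Return (occ(1, S), ..., occ(n, S)) for a multiset of integers in 1..n."""
    counts = Counter(multiset)
    return tuple(counts[i] for i in range(1, n + 1))

def cardinality(occ):
    """|S|: total number of elements, counting repetitions."""
    return sum(occ)

def variety(occ):
    """||S||: number of distinct elements."""
    return sum(1 for c in occ if c > 0)

occ = occurrence_vector([1, 1, 2, 2, 3], n=3)
print(occ, cardinality(occ), variety(occ))
```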

Lex-induced Orderings in Multisets 

The length-lex representation for sets incorporates informa- 
tion about the length (cardinality) and position in the lexi- 
cographic ordering. Such a representation can be extended 
to include the variety information since multisets allow re- 
peated elements. This gives a total of eight different ways 
to order multisets. In the following, we formally define the 
eight orderings, in which four of them order the position lex- 
icographically and the other four colexicographically. 

Lex Orderings 

The lex ordering $\preceq_l$ totally orders multisets
lexicographically. Here, we assume the multisets are represented by
the occurrence representation (i.e., the number of occurrences
of each element is stored in an occurrence
vector). Thus, given two multisets x and y, we compare
their occurrence vectors (occ(1, x), ..., occ(n, x)) and
(occ(1, y), ..., occ(n, y)) from the first position to the last.

Definition 3. A lex ordering $\preceq_l$ is defined by:

$x \preceq_l y$ iff $(x = y) \lor (\exists i, occ(i, x) < occ(i, y) \land \forall j < i, occ(j, x) = occ(j, y))$.

For example, consider two multisets $x = \{\!\{1,2,2\}\!\}$
and $y = \{\!\{1,3,3\}\!\}$. Their occurrence vectors are (1, 2, 0)
and (1, 0, 2) respectively. $\{\!\{1,3,3\}\!\} \preceq_l \{\!\{1,2,2\}\!\}$ because
occ(1, y) = occ(1, x) and occ(2, y) < occ(2, x).

Colex Orderings 

Contrary to the lex ordering, the colex ordering $\preceq_c$
compares the occurrence vectors of two multisets from the last
position to the first.

Definition 4. A colex ordering $\preceq_c$ is defined by:

$x \preceq_c y$ iff $(x = y) \lor (\exists i, occ(i, x) < occ(i, y) \land \forall j > i, occ(j, x) = occ(j, y))$.

For example, let two multisets $x = \{\!\{1,3,3\}\!\}$ and
$y = \{\!\{2,3,3\}\!\}$ with occurrence vectors (1, 0, 2) and (0, 1, 2)
respectively. They are ordered as $\{\!\{1,3,3\}\!\} \preceq_c \{\!\{2,3,3\}\!\}$
because occ(3, x) = occ(3, y) and occ(2, x) < occ(2, y).
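
Both definitions amount to ordinary tuple comparison on the occurrence vector, read forwards for lex and backwards for colex. The following sketch checks the two examples above; the names `lex_key`, `colex_key`, and `precedes` are illustrative assumptions.

```python
def lex_key(occ):
    """Lex ordering: compare occurrence vectors from the first position to the last."""
    return tuple(occ)

def colex_key(occ):
    """Colex ordering: compare occurrence vectors from the last position to the first."""
    return tuple(reversed(occ))

def precedes(x, y, key):
    """True iff x comes before or equals y under the ordering given by key."""
    return key(x) <= key(y)

# Occurrence vectors over the elements 1..3.
m_122, m_133, m_233 = (1, 2, 0), (1, 0, 2), (0, 1, 2)

print(precedes(m_133, m_122, lex_key))    # lex example from the text
print(precedes(m_133, m_233, colex_key))  # colex example from the text
```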

Induced Orderings 

Given a total order $\preceq_\beta$ on a set of multisets, we can have
four different $\beta$-induced orderings when we integrate $\preceq_\beta$
with cardinality and/or variety of multisets.

Length-$\beta$ Ordering The length-$\beta$ ordering $\preceq_{l\beta}$ totally
orders multisets first by their cardinality, and then by the $\beta$
ordering: $x \preceq_{l\beta} y$ iff $|x| < |y| \lor (|x| = |y| \land x \preceq_\beta y)$.

Variety-$\beta$ Ordering The variety-$\beta$ ordering $\preceq_{v\beta}$ totally
orders multisets first by their variety, and then by the $\beta$
ordering: $x \preceq_{v\beta} y$ iff $\|x\| < \|y\| \lor (\|x\| = \|y\| \land x \preceq_\beta y)$.

Length-$\beta$ and variety-$\beta$ prefer cardinality and variety over
the $\beta$ ordering respectively. In fact, both cardinality and
variety can be considered together, giving two more orderings.

Length-variety-$\beta$ Ordering The length-variety-$\beta$ ordering
$\preceq_{lv\beta}$ totally orders multisets first by their cardinality,
then by their variety, and then by the $\beta$ ordering:
$x \preceq_{lv\beta} y$ iff $|x| < |y| \lor (|x| = |y| \land x \preceq_{v\beta} y)$.

Variety-length-$\beta$ Ordering The variety-length-$\beta$ ordering
$\preceq_{vl\beta}$ totally orders multisets first by their variety, then
by their cardinality, and then by the $\beta$ ordering:
$x \preceq_{vl\beta} y$ iff $\|x\| < \|y\| \lor (\|x\| = \|y\| \land x \preceq_{l\beta} y)$.

Since lex and colex orderings are total orders, we can
have eight different orderings by substituting $\beta$ by the lex
and colex orderings. For example, substituting $\beta$ by the
lex ordering in the length-$\beta$ ordering gives the length-lex
ordering LL ($\preceq_{ll}$). Similarly, we can have variety-lex
VL ($\preceq_{vl}$), length-variety-lex LVL ($\preceq_{lvl}$), variety-length-lex
VLL ($\preceq_{vll}$), length-colex LC ($\preceq_{lc}$), variety-colex VC ($\preceq_{vc}$),
length-variety-colex LVC ($\preceq_{lvc}$), and variety-length-colex
VLC ($\preceq_{vlc}$) orderings.
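
One way to read the eight orderings is as composite sort keys: a prefix of cardinality and/or variety, followed by the lex or colex key. The sketch below is an illustration under that reading, not the authors' implementation; `make_key`, `domain`, and `to_multiset` are hypothetical helper names, and the universe is the small one with at most one 1, two 2s, and two 3s.

```python
from itertools import product

def card(occ):
    """Cardinality of a multiset given as an occurrence vector."""
    return sum(occ)

def var(occ):
    """Variety of a multiset given as an occurrence vector."""
    return sum(1 for c in occ if c > 0)

def make_key(name):
    """Build a sort key for one of LL, VL, LVL, VLL, LC, VC, LVC, VLC."""
    # The last letter selects lex (L) or colex (C); the prefix lists the measures.
    base = (lambda o: tuple(o)) if name.endswith("L") else (lambda o: tuple(reversed(o)))
    measures = [{"L": card, "V": var}[ch] for ch in name[:-1]]
    return lambda o: tuple(f(o) for f in measures) + (base(o),)

def domain(universe_occ):
    """All multisets (as occurrence vectors) contained in the universe."""
    return list(product(*(range(c + 1) for c in universe_occ)))

def to_multiset(occ):
    """Expand an occurrence vector into a sorted list of elements."""
    return [i + 1 for i, c in enumerate(occ) for _ in range(c)]

U = (1, 2, 2)  # occurrence vector of the universe {{1, 2, 2, 3, 3}}
for name in ("LL", "VL", "LVL", "VLL"):
    ordered = sorted(domain(U), key=make_key(name))
    print(name, [to_multiset(o) for o in ordered])
```

Because Python compares tuples position by position, putting cardinality before variety in the key gives LVL, and swapping them gives VLL, with no other change.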

The above eight orderings are applicable to multisets. All
the four colex orderings on multisets reduce to the LL ordering
on sets introduced by Gervet and Van Hentenryck
(2006). Note that, when we consider a fixed length, the colex
(resp. lex) ordering for set values is equivalent to ordering
the occurrence vector lexicographically (resp.
colexicographically).

The domain of a multiset variable is simply a set of multisets.
We can thus totally order the domain values of a variable
according to the eight orderings. To illustrate the differences,
Table 1 lists the domain of a multiset variable S with
universe $U = \{\!\{1,2,2,3,3\}\!\}$ in the four lex orderings. Take
the LVL ordering as an example. We first order the multisets
by their cardinality. Thus, $\emptyset$ has cardinality 0 and is the
first multiset, followed by the multisets with cardinalities 1,
2, and so on. For multisets of the same cardinality, we then
compare their variety. Consider the segment with cardinality
2, i.e., from $\{\!\{3,3\}\!\}$ to $\{\!\{1,2\}\!\}$. The multisets $\{\!\{3,3\}\!\}$ and
$\{\!\{2,2\}\!\}$ are ordered before $\{\!\{2,3\}\!\}$, $\{\!\{1,3\}\!\}$, and $\{\!\{1,2\}\!\}$
because the former two have variety 1 and the latter ones have
variety 2. Lastly, we order the multisets lexicographically.
The occurrence vectors of $\{\!\{3,3\}\!\}$ and $\{\!\{2,2\}\!\}$ are (0, 0, 2)
and (0, 2, 0) respectively. Thus, $\{\!\{3,3\}\!\} \preceq_{lvl} \{\!\{2,2\}\!\}$ because
$occ(1, \{\!\{3,3\}\!\}) = occ(1, \{\!\{2,2\}\!\}) = 0$ and
$occ(2, \{\!\{3,3\}\!\}) < occ(2, \{\!\{2,2\}\!\})$.

Given a multiset variable, we can approximate its domain,
which is a set S of multisets, with an $\alpha$-interval, where
$\alpha$ refers to one of the above eight orderings. The interval
$\langle m, M \rangle_\alpha$ must contain all the multisets in S such that m
and M are the lower and upper bounds of S respectively.
We also define the $\alpha$-closure of S, which is the minimal
possible $\alpha$-interval containing S.

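Definition 5. Given an $\alpha$ ordering, an $\alpha$-interval $\langle m, M \rangle_\alpha$
is a set of multisets defined by $\langle m, M \rangle_\alpha = \{x \mid m \preceq_\alpha x \preceq_\alpha M\}$.
The $\alpha$-closure of S is defined by $cl_\alpha(S) = \langle m, M \rangle_\alpha$,
where $S \subseteq \langle m, M \rangle_\alpha$ and there do not exist $m \preceq_\alpha m'$
and $M' \preceq_\alpha M$ such that ($m \neq m'$ or $M \neq M'$) and
$S \subseteq \langle m', M' \rangle_\alpha$.

Definition 6. An $\alpha$ representation of a set S of multisets is
$cl_\alpha(S)$. An $\alpha$ representation of S is exact if $S = cl_\alpha(S)$.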



For example, let the universe $U = \{\!\{1,2,2,3,3\}\!\}$ and
$S = \{\{\!\{1\}\!\}, \{\!\{2,2\}\!\}, \{\!\{2,3\}\!\}\}$. The lvl-closure of S is the
lvl-interval $\langle \{\!\{1\}\!\}, \{\!\{2,3\}\!\} \rangle_{lvl}$. This representation is not exact,
as the interval contains the multiset $\{\!\{3,3\}\!\} \notin S$.
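
The closure in this example can be checked by brute force. The sketch below assumes multisets are stored as occurrence vectors and enumerates the whole domain, which only scales to tiny universes; `lvl_key` and `closure` are illustrative names, not functions from the paper.

```python
from itertools import product

def lvl_key(occ):
    """LVL sort key: cardinality, then variety, then lex on the occurrence vector."""
    return (sum(occ), sum(1 for c in occ if c > 0), tuple(occ))

def closure(S, universe_occ, key):
    """Return (m, M, members), where <m, M> is the minimal interval containing S."""
    m, M = min(S, key=key), max(S, key=key)
    dom = product(*(range(c + 1) for c in universe_occ))
    members = sorted((o for o in dom if key(m) <= key(o) <= key(M)), key=key)
    return m, M, members

U = (1, 2, 2)                           # universe {{1, 2, 2, 3, 3}}
S = {(1, 0, 0), (0, 2, 0), (0, 1, 1)}   # {{1}}, {{2,2}}, {{2,3}}
m, M, members = closure(S, U, lvl_key)
extra = [o for o in members if o not in S]
print(m, M, extra, set(members) == S)
```

The representation is exact precisely when `extra` is empty; here it holds the occurrence vector (0, 0, 2), i.e., the multiset containing two 3s.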

Expressiveness 

An exact representation gives the tightest possible bounds 
and contains no undesired values. It is often the case that 
a set of multisets can be exactly represented using one rep- 
resentation but not using a different representation. In this 
section, we compare the eight representations to see which 
ordering is better in terms of the notion "expressiveness". 

Definition 7. (Walsh 2003) Given a universe U and two
different multiset representations A and B. A is said to
be as expressive as B if $\forall S \subseteq U, (S = cl_A(S)) \leftrightarrow (S = cl_B(S))$.
A is said to be more expressive than B if
$\forall S \subseteq U, (S = cl_B(S)) \rightarrow (S = cl_A(S))$ and
$\exists S \subseteq U, (S = cl_A(S)) \land (S \neq cl_B(S))$. A and B are
incomparable if neither one of them is more expressive than the
other.

The following propositions compare the expressiveness of 
the eight representations under the conditions that the cardi- 
nality and/or variety of a set of multisets is fixed. 

Proposition 1. When both the cardinality and variety are
fixed, (i) the LVL/LVC representation is as expressive as the
VLL/VLC representation, (ii) the LVL/LVC and VLL/VLC
representations are more expressive than the LL/LC and
VL/VC representations respectively, and (iii) the LVL is as
expressive as the LVC and the VLL is as expressive as the
VLC.

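The results in Proposition 1 can be demonstrated using
the example in Table 1. When the cardinality and variety
are 2 and 1 respectively, the LVL and VLL representations
can exactly represent $\{\{\!\{2,2\}\!\}, \{\!\{3,3\}\!\}\}$ by the lvl-interval
$\langle \{\!\{3,3\}\!\}, \{\!\{2,2\}\!\} \rangle_{lvl}$ and the vll-interval $\langle \{\!\{3,3\}\!\}, \{\!\{2,2\}\!\} \rangle_{vll}$
respectively. However, the LL and VL representations
give the ll-interval $\langle \{\!\{3,3\}\!\}, \{\!\{2,2\}\!\} \rangle_{ll}$ and the vl-interval
$\langle \{\!\{3,3\}\!\}, \{\!\{2,2\}\!\} \rangle_{vl}$ respectively, each of which contains an
additional undesired value ($\{\!\{2,3\}\!\}$ in the ll-interval and
$\{\!\{2\}\!\}$ in the vl-interval, as can be read off Table 1).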

The following two propositions relax the conditions to the 
case that either the cardinality or the variety is fixed. 

Proposition 2. When the cardinality is fixed, (i) the
LVL/LVC representation is more expressive than the
VLL/VLC, LL/LC, and VL/VC representations, and (ii) the
LL representation is as expressive as the LC representation.

Proposition 3. When the variety is fixed, (i) the VLL/VLC
representation is more expressive than the LVL/LVC, LL/LC,
and VL/VC representations, and (ii) the VL representation is
as expressive as the VC representation.

In Table 1, when the cardinality is 3, the LVL representation
can exactly represent the multisets by the lvl-interval
$\langle \{\!\{2,3,3\}\!\}, \{\!\{1,2,3\}\!\} \rangle_{lvl}$, while the VLL, LL, or VL
representations cannot. There are additional undesired values in
their corresponding intervals. In fact, when only the variety
is fixed, we obtain similar results. Suppose the variety is 2,
the VLL representation can exactly represent the multisets



Table 1 : The four lex orderings for the domain of a multiset variable S with universe U — -fl, 2, 2, 3, 3}} 



Length-lex (LL) 


9 in <U f2}j ±u m ±u {{3,31 =<« {{2,31 — ^ {{2,2}} ±i {{1,3}} 
-</i ill 211 -<n #2 3 311 -<// #2 2 31 -<// 1(1 3 311 -<n 111 2 31 -<» #1 2 21 
f2,2,3,3S ±u fl,2,3,3l ±1 41.2,2,3} <n f 1,2,2,3,3}} 


Variety-lex (VL) 


Ivl m <ui {{3,31 <vi {{2}} f2,21 <vi {{11 {{2,31 <vi {{2,3,3}} 
-< i f 2 2 3H- -< / f 2 2 3 3H -< i ill 3H- -< r ill 3 3ft -< i ffl 2H -< i iil 2 2t 
f 1, 2, 31 r<„ ; f 1, 2, 3, 31 < v i {{1, 2, 2, 3}} {{1, 2, 2, 3, 3} 


Length- variety-lex (LVL) 


r<;„i {{3}} ± vl {{2}} ;<,,., {{1}} ± M 43,31 X,,,, {{2,21 X M 42, 3 J ;<„,, f 1,3}} 
;<,„, 41,21 ± vl 42,3,31 X w 42,2,31 r<(„i 41,3,31 <M 41,2,21 X M 41,2,31 
±m 42, 2, 3, 31 ± v i {{1, 2, 3, 31 < M 41, 2, 2, 31 < M f 1, 2, 2, 3, 31 


Variety-length-lex (VLL) 


=<„ u {{3}} :<„« 421 =<„« {{1}} 43,31 < M {{2,21 {[2,31 =<„u 41,31 

<m 41,21 42,3,31 X„„ 42,2,31 ;<„,, 41,3,31 ;<„„ {{1,2,21 

r<„„ 42,2,3,31 r<„ii {{1,2,31 <M {{1,2,3,31 ±M {{1, 2, 2, 3}} r< t ,« {{1,2,2,3,31 



by the vll-interval $\langle \{\!\{2,3\}\!\}, \{\!\{2,2,3,3\}\!\} \rangle_{vll}$, while the LVL,
LL, or VL representations cannot.

Compactness 

The notion of expressiveness concerns the exactness of the
representation. However, a domain D of a multiset variable
might not be exactly represented using any of the eight
representations, i.e., $D \subset cl_\alpha(D)$. In such cases, $cl_\alpha(D)$ is an
approximation that contains some undesired values, and our
expressiveness notion does not apply. In this section, we
define a new notion called compactness to compare the eight
representations. This definition is based on a comparison
of the size of the domains, and is different from the notion
of dominance, which is based on the size of the search tree
(Jefferson 2007).

Definition 8. Given a universe U and two different multiset
representations A and B. A is as compact as B iff
$\forall S \subseteq U, |cl_A(S)| = |cl_B(S)|$. A is more compact than B if
$\forall S \subseteq U, |cl_A(S)| \le |cl_B(S)|$ and $\exists S \subseteq U, |cl_A(S)| < |cl_B(S)|$.
A and B are compactly incomparable if neither one of them
is more compact than the other.

The following proposition characterizes the compactness 
of the eight orderings. 

Proposition 4. (i) The LVL/LVC representation is more
compact than the LL/LC representation and compactly
incomparable to the VLL/VLC representation. (ii) The
VLL/VLC representation is more compact than the VL/VC
representation. (iii) The LL/LC representation is compactly
incomparable to the VL/VC representation.

In Table 1, suppose we want to represent the set S of all
multisets whose variety is 2. Neither the LVL nor the LL
representation can exactly represent S, and both give an $\alpha$-closure
with the same lower and upper bounds (i.e., $\{\!\{2,3\}\!\}$ and
$\{\!\{2,2,3,3\}\!\}$ respectively). Both the lvl- and ll-intervals
contain undesired values. Comparing their sizes,
$|cl_{lvl}(S)| = 9 < |cl_{ll}(S)| = 10$. Here the LVL representation
is more compact than the LL representation.
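
The closure sizes quoted above can be recomputed by enumerating the domain of Table 1. This is a brute-force sketch for illustration only; the key functions and `closure_size` are assumed names, and multisets are stored as occurrence vectors.

```python
from itertools import product

def card(o):
    return sum(o)

def var(o):
    return sum(1 for c in o if c > 0)

def ll_key(o):
    return (card(o), tuple(o))

def lvl_key(o):
    return (card(o), var(o), tuple(o))

def vll_key(o):
    return (var(o), card(o), tuple(o))

def closure_size(S, dom, key):
    """Number of domain values lying between the least and greatest element of S."""
    lo, hi = min(map(key, S)), max(map(key, S))
    return sum(1 for o in dom if lo <= key(o) <= hi)

# Domain for the universe {{1, 2, 2, 3, 3}}: at most one 1, two 2s, two 3s.
dom = list(product(range(2), range(3), range(3)))
S = [o for o in dom if var(o) == 2]  # all multisets of variety 2

print(len(S), closure_size(S, dom, lvl_key),
      closure_size(S, dom, ll_key), closure_size(S, dom, vll_key))
```

The VLL closure has the same size as S itself, which matches the earlier observation that VLL is exact when the variety is fixed.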

Using the VL/VLL representations for multiset variables 
would be useful when we have tight constraints on the vari- 
eties of the multiset variables. For instance, Law, Lee, and 
Woo (2009) demonstrated the value of this on extended 
Steiner system problems in which there are tight constraints 



over the varieties. On the other hand, the LL/LVL represen- 
tations would favour the kind of problems with more cardi- 
nality restrictions or with variables having fixed cardinali- 
ties. 

Empirical Comparisons 

Before we apply the eight representations to model and solve 
multiset problems, we first empirically evaluate their ex- 
pressiveness and compactness. We perform experiments to 
compare the size of the eight representations of a set D of 
multisets when different cardinality and variety constraints 
are imposed. In the experiment, the universe U is a multiset 
which contains 10 occurrences of elements 1 to 5. For all in- 
stances, D is a randomly generated subset of the power set 
of U. The comparison aims at measuring the compactness 
of different representations in approximating D. We record
$|cl_\alpha(D)|$, the number of multisets in the $\alpha$-closure of D that
satisfy the cardinality and the variety constraints, where $\alpha$
refers to the eight representations: LL, LC, VL, VC, LVL,
LVC, VLL, and VLC. Due to space limitation, we summa- 
rize the observations as follows. 

When both cardinality and variety are fixed, the LVL/LVC 
and VLL/VLC representations can always exactly represent
the domain values, giving the corresponding minimal
$\alpha$-interval $cl_\alpha(S)$. For all instances, the LVL/LVC and
VLL/VLC representations demonstrate a large reduction in 
the domain size when compared with the LL/LC and VL/VC 
representations. 

When the variety is fixed, the VLL/VLC ordering first 
considers the variety of each multiset and narrows down the 
bounds to a larger extent by removing the multisets with 
unwanted varieties. For each variety, the multisets are then 
ordered by their cardinality, which allows further pruning 
of the multisets with undesired cardinalities on the domain 
bounds. Thus, the VLL/VLC representation can always give 
the exact representation and achieve on average one to two 
orders of magnitude reduction in the domain size when com- 
pared with the LL/LC and VL/VC representations. In con- 
trast, the LVL/LVC representation can always give the exact 
representation when the cardinality is fixed. 

When the cardinality and variety are constrained to cer- 
tain ranges, although all eight representations fail to give 
the exact representation for all instances, the LVL/LVC 
and VLL/VLC representations are more compact than the 



LL/LC and VL/VC representations respectively.

To conclude, the LVL/LVC and VLL/VLC representa- 
tions are always more compact than the LL/LC and VL/VC 
respectively. This means that they will usually give tighter 
bounds during constraint propagation. In the following, we 
study how the eight representations behave in practice as 
bounds propagation in a multiset solver. 

Bounds Consistency 

Since a multiset domain is totally ordered in the eight repre- 
sentations, we can enforce bounds consistency. To be more 
precise, we define bounds consistency on a k-ary constraint
on multiset variables (for any k).

Definition 9. Bounds Consistency (BC)
Let $S_1, \ldots, S_n$ be multiset variables with interval domains
$D(S_i) = \langle m_{S_i}, M_{S_i} \rangle$. Given a constraint C over $S_1, \ldots, S_n$
and an $\alpha$ ordering, a value $m_i$ for variable $S_i$ has an
$\alpha$-bound support $(m_1, \ldots, m_n)$ if the support satisfies C and
$\forall m_j, m_{S_j} \preceq_\alpha m_j \preceq_\alpha M_{S_j}$.

The constraint C is bounds consistent iff for each $S_i$, both
$m_{S_i}$ and $M_{S_i}$ have $\alpha$-bound supports.

The eight representations offer greater expressiveness, but 
we have to be careful that reasoning remains tractable. In- 
deed, even with a single unary constraint, we can get in- 
tractability. 

Theorem 1. There exists a constraint on one set variable 
such that enforcing BC on subset bounds is polynomial but 
enforcing BC on LL bounds is NP-hard. 

Proof. Reduction from 3-SAT with N variables, $X_1$ to $X_N$,
and M clauses. We construct a set variable S with elements
that have the following meaning: $2i$ represents a truth
assignment in which $X_i$ is true whilst $2i - 1$ represents a truth
assignment in which $X_i$ is false ($1 \le i \le N$), and each
integer above $2N$ represents one of the (polynomial number
of) distinct clauses. We consider a unary constraint on this
set variable which is satisfied only when the set contains
integers representing a proper truth assignment (that is, $2i \in S$
iff $2i - 1 \notin S$ for $1 \le i \le N$) and this assignment
satisfies the clauses represented by the integers in the set greater
than $2N$, or the set contains integers representing a superset
of a proper truth assignment (that is, either $2i$ or $2i - 1$ or
both occur in S for $1 \le i \le N$). Subset bounds are
polynomial to compute since, if the upper bound includes a proper
truth assignment, we leave the upper bound untouched and
adjust the lower bound to include any necessary elements in
linear time and, where needed, check the truth assignment.
On the other hand, if the upper bound does not include a
proper truth assignment, the unary constraint has no support.
By comparison, length-lex bounds are NP-hard to compute.
We consider domains that fix the possible and necessary
elements to be the clause that we wish to decide, and make
none of the other integers necessary but all of them possible.
Then, enforcing bounds consistency on the length-lex bounds
will allow us to decide the satisfiability of the original
formula. □



It is worth noting that the opposite does not hold. If LL 
bounds are polynomial to compute, then subset bounds are 
too. 

Theorem 2. Given an n-ary constraint on set and/or multiset
variables, if enforcing BC on LL bounds is polynomial,
then enforcing BC on subset bounds is also polynomial.

Proof (sketch). Let the possible values of a set variable
S be {1, . . . , n}. We can convert subset bounds into LL 
bounds easily by ordering the sets first by cardinality and 
then lexicographically. This operation is polynomial. Af- 
ter enforcing BC on LL bounds, we can then convert LL 
bounds back to subset bounds using the inclusion propagator 
(Gervet and Van Hentenryck 2006). Such conversion is also 
polynomial. Thus, if enforcing BC on LL bounds is poly- 
nomial, then enforcing BC on subset bounds is also polyno- 
mial. □ 

With two unary constraints, Sellmann's Lemma 1 shows 
that finding the fixpoint on the LL representation of a single 
set variable is NP-hard (Sellmann 2009). Given the above 
theorems, enforcing BC on LL bounds is NP-hard. However,
exponential-time propagation algorithms may still help
reduce runtimes (Yip and Van Hentenryck 2010).

Here, we show an example on how BC works on the do- 
mains in the LL and LVL representations. 

Consider the universe $U = \{\!\{1,1,1,2,2,2,3,3,3\}\!\}$ and
multiset variables X, Y, and Z. The constraints are:
$|X| = |Y| = |Z| = 3$, $\|Z\| = 1$, and $X \cap Y = Z$.
The initial domains are $D(X) = D(Y) = D(Z) =
\langle \emptyset, \{\!\{1,1,1,2,2,2,3,3,3\}\!\} \rangle_{lvl}$. In the LVL representation,
enforcing $|X| = |Y| = |Z| = 3$ tightens the bounds to
have cardinality 3, i.e., $D(X) = D(Y) = D(Z) =
\langle \{\!\{3,3,3\}\!\}, \{\!\{1,2,3\}\!\} \rangle_{lvl}$. The bounds correspond to the
occurrence vectors (0, 0, 3) and (1, 1, 1). Since $\|\{\!\{1,2,3\}\!\}\| \neq 1$,
the upper bound of Z is updated to $\{\!\{1,1,1\}\!\}$, with occurrence
vector (3, 0, 0), resulting in
$D(Z) = \langle \{\!\{3,3,3\}\!\}, \{\!\{1,1,1\}\!\} \rangle_{lvl}$. This triggers the
propagation on $X \cap Y = Z$ and tightens the upper bounds
of X and Y. After constraint propagation, $D(X) = D(Y) =
\langle \{\!\{3,3,3\}\!\}, \{\!\{1,1,1\}\!\} \rangle_{lvl}$. Now, the problem is bounds
consistent and $|D(X)| = |D(Y)| = |D(Z)| = 3$. However, in
the LL representation, the problem is bounds consistent after
enforcing the cardinality constraint $|X| = |Y| = |Z| = 3$:
$D(X) = D(Y) = D(Z) = \langle \{\!\{3,3,3\}\!\}, \{\!\{1,1,1\}\!\} \rangle_{ll}$ and
$|D(X)| = |D(Y)| = |D(Z)| = 10$. Thus, different
representations result in different domain sizes after enforcing BC,
and LVL gives a tighter bound than LL in this example.
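
The domain sizes in this example can be confirmed by exhaustive enumeration. The sketch below is not a propagator: it simply counts the multisets inside each final interval and lists the solutions of the three constraints by brute force. All function names are illustrative assumptions.

```python
from itertools import product

def card(o):
    return sum(o)

def var(o):
    return sum(1 for c in o if c > 0)

def ll_key(o):
    return (card(o), tuple(o))

def lvl_key(o):
    return (card(o), var(o), tuple(o))

# Universe {{1,1,1,2,2,2,3,3,3}}: each element may occur 0 to 3 times.
dom = list(product(range(4), repeat=3))

def interval(lo, hi, key):
    """All domain values between lo and hi under the given ordering."""
    return [o for o in dom if key(lo) <= key(o) <= key(hi)]

def intersect(x, y):
    """Multiset intersection on occurrence vectors: elementwise minimum."""
    return tuple(min(a, b) for a, b in zip(x, y))

# Brute-force the solutions of |X| = |Y| = |Z| = 3, ||Z|| = 1, X ∩ Y = Z.
triples = [o for o in dom if card(o) == 3]
solutions = [(x, y, intersect(x, y)) for x in triples for y in triples
             if card(intersect(x, y)) == 3 and var(intersect(x, y)) == 1]

lvl = interval((0, 0, 3), (3, 0, 0), lvl_key)
ll = interval((0, 0, 3), (3, 0, 0), ll_key)
print(len(lvl), len(ll), sorted({x for x, _, _ in solutions}))
```

In this instance the LVL interval coincides with the set of values of X that appear in some solution, whereas the LL interval still contains seven values of variety 2 or 3 that cannot be extended to a solution.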

Experimental Results 

To verify the feasibility and efficiency of our proposal, we
adapt and simplify the implementation of the length-lex
representation for set variables (Van Hentenryck et al. 2008) to
implement the eight representations (LL, LVL, VL, VLL,
LC, LVC, VC, VLC) for multiset variables in ILOG Solver
6.0 (ILOG 2003). We have also developed the ternary
intersection ($X \cap Y = Z$) and unionplus ($X \uplus Y = Z$) multiset
constraints, which are not available in the original LL
implementation.



Table 2: Experimental results of the extended Steiner system.

              SB+CR+VR           LL                 LVL                VL                 VLL
t,k,u,b,v     Fail      Time     Fail      Time     Fail      Time     Fail      Time     Fail      Time
2,4,5,4,2     57329     3.59     19187     1.48     2930      0.34     3790      95.37    2945      3.38
2,4,5,5,2     356785    28.71    89768     10.04    19718     3.13     30755     541.13   19991     14.32
3,4,4,4,2     1710      0.1      942       0.08     278       0.03     309       1.77     305       0.58
3,4,4,5,2     30034     2.36     13541     1.39     658       0.11     922       20.33    729       15.13
3,4,5,5,3     312397    22.17    38109     5.84     12195     1.36     -         -        12363     7.23
3,4,5,6,3     2108410   190.15   281911    57.83    103163    13.39    -         -        106145    63.83
3,4,5,7,3     9813128   1097     1352165   380.42   384145    63.05    -         -        398511    285.16



Table 3: Experimental results of the generalized social golfer problem.

              SB+CR+VR           LL                 LVL                VL                 VLL
w,m,n,g,p,v   Fail      Time     Fail      Time     Fail      Time     Fail      Time     Fail      Time
3,3,3,2,4,2   14934     1.61     15108     0.94     14479     0.87     2171      0.44     2395      0.27
3,3,4,2,4,2   394570    40.29    111102    6.41     103756    5.59     39        0.06     39        0.05
3,3,4,2,5,2   185839    20.32    181801    12.37    172818    11.27    11536     8.61     12428     2.84
4,3,4,2,4,2   -         -        14071439  1003.03  12983736  874.96   151132    78.47    151132    41.6
4,3,4,2,5,2   -         -        12818684  1103     12496315  1046.14  1035895   437.89   1098395   173.74
3,4,3,2,4,3   2631024   348.04   1889782   129.28   1510939   94.21    21        0.28     21        0.29
3,4,4,2,4,3   -         -        4062535   280.02   3339400   210.61   27        3.99     27        3.95



We perform experiments on the extended Steiner sys- 
tem and the generalized social golfer problem. They are 
run on a Sun Blade 2500 (2 x 1.6GHz US-IIIi) worksta- 
tion with 2GB memory. We report the number of fails (i.e., 
the number of backtracks occurred in solving a model) and 
CPU time in seconds to find and prove the optimal solution 
for each instance. Comparisons are made among the subset
bounds representation with cardinality-variety reasoning
(SB+CR+VR) (Law, Lee, and Woo 2009) and the eight
representations we have implemented. Since the results of 
the four colex representations (LC, LVC, VC, VLC) are sim- 
ilar to their corresponding lex counterparts (LL, LVL, VL, 
VLL), they are not reported in the tables. In the tables, the 
first column shows the problem instances. The subsequent 
columns show the results of using various representations. 
The best number of fails and CPU time among the results 
for each instance are highlighted in bold. A cell labeled with 
"-" denotes a timeout after 20 minutes. 

The extended Steiner system ES(t, k, u, b), an
important and practical multiset problem in information
retrieval (Johnson and Mendelsohn 1972;
Bennett and Mendelsohn 1980; Park and Blake 2008), is a
collection of b blocks. Each block is a k-element multiset
drawn from a u-element set whose elements can be drawn
multiple times. For every two blocks in the collection, the
cardinality of their intersection must be smaller than t.
We adapt the problem to become an optimization problem 
which maximizes the sum of the varieties of the multisets. 
To further increase difficulty, we constrain each multiset 
variable to have variety at least v. 
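
The constraints of the adapted problem can be made concrete with a
small checker. The following is a minimal sketch, not the model used
in the experiments: it assumes the u-element set is the integers
1..u, represents each block as a tuple of elements, and the helper
names (intersection_cardinality, variety, is_extended_steiner,
total_variety) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def intersection_cardinality(a, b):
    """Cardinality of the multiset intersection (minimum occurrence counts)."""
    return sum((Counter(a) & Counter(b)).values())


def variety(block):
    """Number of distinct elements in a multiset."""
    return len(set(block))


def is_extended_steiner(blocks, t, k, u, v=0):
    """Check the ES(t, k, u, b) conditions plus the variety bound v.

    Assumption: the universe is {1, ..., u}; b is simply len(blocks).
    """
    for block in blocks:
        if len(block) != k:                      # each block has k elements
            return False
        if any(x < 1 or x > u for x in block):   # drawn from the u-element set
            return False
        if variety(block) < v:                   # variety at least v
            return False
    # every two blocks must intersect in fewer than t elements
    return all(intersection_cardinality(a, b) < t
               for a, b in combinations(blocks, 2))


def total_variety(blocks):
    """Objective of the optimization variant: sum of the varieties."""
    return sum(variety(block) for block in blocks)


blocks = [(1, 1, 2), (2, 3, 3), (1, 3, 4)]
feasible = is_extended_steiner(blocks, t=2, k=3, u=4)
objective = total_variety(blocks)
```

Here every pair of blocks shares exactly one element, so the
collection is feasible for t = 2, while raising the variety bound to
v = 3 rules it out because two of the blocks have variety 2.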

The generalized social golfer problem SG(w, m, n, g, p)
extends the social golfer problem (prob010 in CSPLib (Gent
and Walsh 1999)) from sets to multisets: we schedule m teams
of n members into g groups of p golfers over w weeks. Each
group contains golfers from different teams and they play
against each other. To maximize socialization, the number of
times two teams meet each other again is minimized. As in
the extended Steiner system, each multiset variable is
constrained to have variety at least v.

Tables 2 and 3 show the experimental results for the
extended Steiner system and the generalized social golfer
problem respectively. All four lex representations give fewer
fails and faster runtimes than SB+CR+VR (Law, Lee, and Woo
2009). This confirms that the lex representations take
advantage of the cardinality and variety information to give
tighter bounds than SB+CR+VR.

In the extended Steiner system, the LVL representation
always achieves the fewest fails. There is about a 95%
reduction in the number of fails when compared to SB+CR+VR.
The LVL representation achieves fewer fails than the VLL
representation because the problem has tighter constraints on
the cardinalities than on the varieties of the multiset
variables.

When comparing the results between LL and LVL, the 
latter performs better. This is because in the LVL represen- 
tation, the multisets are ordered according to their varieties 
under the same cardinality. When enforcing BC, the mul- 
tisets with the same varieties can be pruned together when 
they violate the variety constraints. However, in the LL rep- 
resentation, these multisets are scattered over the ordering 
and we cannot remove all of them from the domain at the 
same time, thus resulting in a larger search tree and more
fails. Similarly, VLL performs better than VL.
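
The grouping effect can be illustrated on a tiny domain. The sketch
below is illustrative only: it assumes multisets are written as
sorted tuples and compared lexicographically as tuples, and that
smaller varieties come first in the LVL ordering; the paper's exact
ordering definitions may differ in these details. The helper names
(ll_key, lvl_key, runs) are hypothetical. It counts how many separate
contiguous runs of the ordering contain the multisets that violate a
minimum-variety constraint.

```python
from itertools import combinations_with_replacement


def variety(ms):
    """Number of distinct elements in a multiset given as a sorted tuple."""
    return len(set(ms))


def ll_key(ms):
    """Length-lex: order by cardinality, then lexicographically."""
    return (len(ms), ms)


def lvl_key(ms):
    """Length-variety-lex: cardinality, then variety, then lexicographically."""
    return (len(ms), variety(ms), ms)


def runs(flags):
    """Count the maximal runs of consecutive True values."""
    count, previous = 0, False
    for flag in flags:
        if flag and not previous:
            count += 1
        previous = flag
    return count


# All multisets of cardinality 3 over the universe {1, 2, 3}.
multisets = list(combinations_with_replacement((1, 2, 3), 3))
min_variety = 2  # constraint: variety at least 2

ll_order = sorted(multisets, key=ll_key)
lvl_order = sorted(multisets, key=lvl_key)

ll_runs = runs([variety(m) < min_variety for m in ll_order])
lvl_runs = runs([variety(m) < min_variety for m in lvl_order])
```

Under these assumptions the three violating multisets (1,1,1),
(2,2,2), and (3,3,3) form a single block at the start of the LVL
ordering, so moving one bound removes all of them, whereas in the LL
ordering they fall into three separate runs.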

The instances listed in Table 2 are all satisfiable. In our
experiments, there are some unsatisfiable instances, in which
the number of fails and runtime of LVL and VLL can be 
slightly larger than LL and VL respectively. We also tried 
to fix both cardinalities and varieties of the multiset vari- 
ables. Since the multisets are ordered lexicographically un- 
der a fixed cardinality and variety, LVL and VLL give the 
same number of fails. 

For the generalized social golfer problem, VL and VLL
perform better than LL and LVL because the problem has
tighter constraints on the varieties than on the
cardinalities of the multiset variables. Since there are many
more constraints in this problem than in the extended Steiner
system, the generalized social golfer problem is more
complicated. We observe that the VL representation always
achieves the fewest fails. However, the VLL representation
has the fastest runtime because the extra prunings in the VL
representation cannot compensate for the overhead of finding
new bounds of multiset variables.
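
To give a sense of scale, the relative reduction in fails can be
computed directly from the reported counts. The sketch below uses
the fail counts of two social golfer instances, assuming the table
columns are ordered SB+CR+VR, LL, LVL, VL, VLL; the function name
reduction and the dictionary layout are hypothetical.

```python
def reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Fail counts for two instances (assumed column order, see above).
fails = {
    "3,3,4,2,5,2": {"SB+CR+VR": 185839, "VL": 11536, "VLL": 12428},
    "3,4,3,2,4,3": {"SB+CR+VR": 2631024, "VL": 21, "VLL": 21},
}

for instance, row in fails.items():
    for name in ("VL", "VLL"):
        pct = reduction(row["SB+CR+VR"], row[name])
        print(f"{instance}  {name}: {pct:.1f}% fewer fails than SB+CR+VR")
```

On these two instances both VL and VLL cut the number of fails by
more than 90% relative to SB+CR+VR.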

Conclusion 

We have proposed eight representations for multiset vari- 
ables, which integrate together information about the car- 
dinality, variety, and position in the (co)lexicographic order- 
ing. We have made a detailed comparison of the expressive- 
ness and compactness between the eight different represen- 
tations. The LVL/LVC and VLL/VLC representations are al- 
ways more expressive and more compact than the LL/LC 
and VL/VC representations. Compactness is a new notion 
which lets us compare inexact representations. We have also 
performed experiments on some benchmark problems. Ex- 
perimental results confirm that LVL and VLL usually give 
tighter bounds during constraint propagation, resulting in 
smaller search trees and better runtimes. LVL performs better
in some cases and VLL in others. It would be interesting
to study if the two representations can be linked together so 
that we can take advantage of each representation. 